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Abstract 

A search for Higgs boson pair production pp —> hh is performed with 19.5 fb -1 of proton- 
proton collision data at yfs = S TeV, which were recorded by the ATLAS detector at the 
Large Hadron Collider in 2012. The decay products of each Higgs boson are reconstructed 
as a high-momentum bb system with either a pair of small-radius jets or a single large- 
radius jet, the latter exploiting jet substructure techniques and associated /Magged track- 
jets. No evidence for resonant or non-resonant Higgs boson pair production is observed. 
The data are interpreted in the context of the Randall-Sundrum model with a warped extra 
dimension as well as the two-Higgs-doublet model. An upper limit on the cross-section for 
pp —» G kk —* hh —> bbbb of 3.2 (2.3) fb is set for a Kaluza-Klein graviton G KK mass of 
1.0 (1.5) TeV, at the 95% confidence level. The search for non-resonant Standard Model 
hh production sets an observed 95% confidence level upper limit on the production cross- 
section cr(pp —» hh —> bbbb ) of 202 fb, compared to a Standard Model prediction of cr(pp —» 
hh —> bbbb) = 3.6±0.5fb. 
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1 Introduction 


The discovery of a Higgs boson ( h ) [1,2] at the Large Hadron Collider (LHC) consistent with the predictions 
of the Standard Model (SM) [3,4] motivates an enhanced elfort to search for new physics via the Higgs 
sector. Many new physics models predict rates of Higgs boson pair production significantly higher than the 
SM rate [5-7], For example, TeV-scale resonances such as the first Kaluza-Klein (KK) excitation of the 
graviton, G^ K , predicted in the bulk Randall-Sundrum (RS) model [8,9] or the heavy neutral scalar, H, of 
two-Higgs-doublet models (2HDM) [10] can decay into pairs of Higgs bosons, hh. Enhanced non-resonant 
pp —> hh production can arise in models such as those with new, light, coloured scalars [11], or direct tthh 
vertices [12,13]. 

ATLAS has carried out a search in the bbyy final state [14], setting limits on both resonant (masses 
between 260 GeVand 500 GeV) and non-resonant Higgs boson pair production. CMS has searched in 
the multi-lepton and multi-lepton+photons final-states in the context of 2HDM extensions of the Higgs 
sector [15]. CMS has also searched for narrow resonances in the 6666 channel [16]. 

Recent phenomenological studies have demonstrated that despite the fully hadronic final state being 
subject to a large multijet background, searches for new physics in the pp —> hh —> bbbb process have 
good sensitivity for both resonant [17,18] and non-resonant signals [19]. One contributing factor to this 
sensitivity is the high expected branching ratio for h —> bb. The analysis presented in this paper is designed 
to search for two high-momentum bb systems with masses consistent with /;?/,, where each bb system 
contains two jets identified as containing 6-hadrons (the jets are “6-tagged”). Compared to a more inclusive 
bbbb final-state analysis, this topology has many benefits due to the large required momentum and angular 
separation between the two bb systems: (i) excellent rejection of all backgrounds; (ii) highly efficient 
triggering using 6-tagged multijet triggers; and (iii) negligible combinatorial ambiguity in forming Higgs 
boson candidates. 

Two Higgs boson reconstruction techniques, which are complementary in their acceptance, are presented. 
The first—“resolved”—technique reconstructs Higgs boson candidates from pairs of nearby anti-6, 
jets [20] with radius parameter R = 0.4, each 6-tagged with a multivariate 6-tagging algorithm [21]. This 
resolved technique oflFers good efficiency over a wide range of Higgs boson momenta and so can be 
used to reconstruct di-Higgs-boson resonances with mass mx in the range between 500 and 1500 GeV. 
The sensitivity is best for this technique in the range 500 < mx <1100 GeV. It can be seen in Fig. 1 
however, that the acceptance for four 6-tagged anti-6, R = 0.4 jets decreases for mx < 1200 GeV. This 
loss of acceptance is due to the increased boost of the Higgs boson, which reduces the average separation 
between the 6- and 6-quarks from the Higgs boson decay, A R = y/(A//) 2 + (Af/ff 2 , to values below 0.4. 
This motivates the use of a second—“boosted”—Higgs boson reconstruction technique that maintains 
acceptance for these higher-mass resonances through the use of jet substructure techniques. The Higgs 
boson candidate is reconstructed as a single, trimmed [22] anti-6, R = 1.0 jet which must be associated 
with two 6-tagged anti-6, R = 0.3 track-jets [23]. The use of track-jets with a smaller R parameter allows 
Higgs bosons with higher transverse momentum (pj) to be reconstructed. 

The analysis is performed with the dataset recorded by ATLAS in 2012 at yfs = S TeV, corresponding to 
an integrated luminosity of 19.5 fb -1 . For the non-resonant search, a counting experiment is performed 
and the results are interpreted in the context of SM non-resonant Higgs boson pair production. This 
interpretation is only carried out for the resolved analysis due to its higher sensitivity to such a signal. 
For the resonant search, a fit to the reconstructed mass spectrum of hh candidates is carried out and the 
results are interpreted in the context of both bulk RS G' KK (spin-2) and 2HDM CP-even H boson (spin-0) 
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Figure 1: Acceptance times reconstruction and selection efficiency as a function of graviton mass for the bulk RS 
model, for the resolved and boosted analyses. The shapes of the curves are driven by the separation between the 
/?-quarks from the Higgs boson decays and the impact on jet clustering. The requirements are defined in Sects. 4.2 
and 5.2. 


production. In the bulk RS model, the fermion and boson fields can propagate in a warped extra dimension, 
which has a curvature parameter k. This benchmark model is investigated with three coupling constant 
values, k/Mpi = 1.0,1.5 and 2.0 (Mpi = Mp\/ V 87 T is the reduced Planck mass), which cover much of 
the possible parameter space [ 8 ], The 2HDMs considered have CP-conserving scalar potentials (Type-I, 
Type-II, Lepton-specific and Flipped) [10], in the regime m# = m,\ - m#±, with the potential parameter 
that mixes the two Higgs doublets ntp = m 2 tan /?/( I + tan 2 (3). Interpretations are made as a function 
of tan/3 and cos (fi - a). The parameter tan (3 is the ratio of vacuum expectation values of the two Higgs 
doublets and a is the mixing angle between the two neutral CP-even scalars. 


2 The ATLAS detector 

ATLAS is a multi-purpose particle physics experiment [24] at the LHC. The detector 1 consists of inner 
tracking devices surrounded by a superconducting solenoid, electromagnetic and hadronic calorimeters, and 
a muon spectrometer. The inner tracking system provides charged-particle tracking in the pseudorapidity 
region |//| < 2.5 and vertex reconstruction. It consists of a silicon pixel detector, a silicon microstrip tracker, 
and a straw-tube transition radiation Lacker. The system is surrounded by a solenoid that produces a 2 T 
axial magnetic field. The central calorimeter system consists of a liquid-argon electromagnetic sampling 
calorimeter with high granularity covering |?/| <3.2 and a steel/scintillator-tile calorimeter providing 
hadronic energy measurements in the central pseudorapidity range (\rj\ < 1.7). The endcap and forward 
regions are instrumented with liquid-argon calorimeters for both electromagnetic and hadronic energy 
measurements up to I/ 7 I = 4.9. The muon spectrometer is operated in a magnetic field provided by air-core 
superconducting toroids and includes tracking chambers for precise muon momentum measurements up to 
\i]\ = 2.7 and trigger chambers covering the range |//| < 2.4. A three-level trigger system is used to select 

1 ATLAS uses a right-handed coordinate system with its origin at the nominal interaction point (IP) in the centre of the detector 
and the z-axis along the beam pipe. The x-axis points from the IP to the centre of the LHC ring, and the ^-axis points 
upwards. Cylindrical coordinates (r, 0) are used in the transverse plane, 0 is the azimuthal angle around the beam pipe. The 

pseudorapidity, 77 , is defined in terms of the polar angle 9 as 77 = - ln[tan( 0 / 2 )]. 


3 




Table 1: Computed production cross-sections times branching ratio cr (jjp —>X —> hh —» bbbb'j and total widths for 
selected resonance pole mass values. The bulk RS model predictions are obtained with k / M?\ = 1.0; both cross-section 
and width grow as the square of k/Mp\. The 2HDM predictions are for a Type-II model with cos (J3 - a) - -0.2 and 
tan f3 = 1. 


Model 

Mass [GeV] 

cr x B [fb] 

r [GeV] 

Bulk RS 

1000 

1.47 

55 

Bulk RS 

1500 

0.085 

90 

2HDM 

1000 

5.54 

130 

2HDM 

1500 

0.330 

332 


interesting events [25]. The Level-1 trigger reduces the event rate to below 75 kHz using hardware-based 
trigger algorithms acting on a subset of detector information. Two software-based trigger levels, referred 
to collectively as the High-Level Trigger (HLT), further reduce the event rate to about 400 Hz using 
information from the entire detector. 


3 Data and simulation samples 

The data sample used in this analysis, after applying data quality requirements that include the availability 
of b -jet triggers, corresponds to an integrated luminosity of 19.5 ±0.5 fb -1 . The uncertainty in the integrated 
luminosity (2.8%) is derived following the same methodology as that detailed in Ref. [26]. The data sample 
is selected by a combination of five triggers requiring multiple jets or b- jets, where b-jets are identified by 
a dedicated HLT ^-tagging algorithm. This combination of triggers is >99.5% efficient for signal events 
passing the offline selection, across the full mass range considered. 

Simulated Monte Carlo (MC) event samples arc used to model the different signals, as well as the small 
background contributions from top-quark pair production (tt) and Z+jets events. The dominant multijet 
background source is estimated directly from data. Signal samples for both models studied are generated 
with Madgraph vl.5.1 [27,28], interfaced to Pythia v8.175 [29] for parton showering, hadronization and 
underlying-event simulation. The Higgs boson mass is set to 125 GeV. The CTEQ6L1 [30] leading-order 
(LO) parton distribution functions (PDFs) are used. Table 1 provides the calculated cross-sections and 
widths for different signal model parameters. The bulk RS model predictions are calculated at leading order 
using Madgraph. The 2HDM prediction corresponds to the cross-section for gluon-fusion production plus 
/^-associated production plus vector-boson-fusion production. The gluon-fusion cross-section is calculated 
using SusHi v 1.3.0 [31-36] at next-to-next-to-leading-order (NNLO) accuracy in QCD. For Zr-associated 
production, an empirical matching of the four- and five-flavour scheme is used [37], The four-flavour 
cross-section is calculated at next-to-leading-order (NLO) accuracy in QCD following Refs. [38, 39], 
while the five-flavour cross-section is calculated at NNLO in QCD using SusHi. The vector-boson- 
fusion cross-section at NNLO accuracy in QCD and NLO in electroweak is taken from Ref. [40] and 
corrected by a multiplicative factor of cos 2 (J3 - a) [10]. The 2HDM branching ratios are calculated using 
2HDMC v 1.6.4 [41]. 

For the G' KK -a hh —> bbbb signal, three sets of MC samples are generated for each of the three coupling 
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values, k/Mpi =1.0, 1.5 and 2.0. The variation in these couplings alters both the total G^ K production 
cross-section and its width. Samples cover a range of 6” KK masses 500 < m o Kl . < 2000 GeV. For the 
H —> hh —> bbbb signal, samples are generated covering the range 500 < m# < 1500 GeV. Since the 
width of H, Th, varies non-trivially with the 2HDM parameters, the samples are generated with fixed 
Th = 1 GeV. In order to make 2HDM interpretations of the results obtained with these fixed width samples, 
they are corrected to account for the true resonance width at each point in parameter space, as described in 
Sect. 6.2. 

Non-resonant SM pp —> hh —> bbbb events are generated using the exact form factors for the top loop, 
taken from HPAIR [5,42]. The cross-section is defined as the inclusive cross-section. The gluon-fusion 
production cross-section at NNLO in QCD from Ref. [7] is used, summed with the NLO predictions for 
vector-boson-fusion, top-pair-associated and vector-boson-associated production from Ref. [43]. The 
resulting cross-section is cr(pp —> hh ■—> bbbb) = 3.6 + 0.5 fb, where the uncertainty term includes the 
effects of uncertainties in the renormalization and factorization scale, PDFs, as and Br(H —» bb). 

The tt background sample is generated using Powheg-box vl.O [ 44—47 ] interfaced to Pythia v6.426 [48], 
with the top mass fixed to 172.5 GeV and the CT10 [49] NLO PDF set. The NNLO+NNLL prediction of 
253 pb for the tt cross-section [50-55] is used for normalization. Single-top background is negligible. 

The Z+jets sample is generated using Sherpa vl.4.3 [56] with the CT10 PDF set and the Z boson decaying 
to two 6-quarks. The Z+jets cross-section is taken from an NLO Powheg-box vl.O [57] plus Pythia v8. 165 
prediction, which is found to agree well with measurements in the boosted regime [58]. 

The generated MC events are processed with the GEANT4-based [59] ATLAS detector simulation [60]. 
Effects of multiple proton-proton interactions (pile-up) are simulated using Pythia v8.1 with the CTEQ6L1 
PDF set and the AU2 tune [61]. The simulated events are weighted so that the distribution of the average 
number of interactions per bunch-crossing matches that in the data. The same reconstruction software is 
used to process both the data and the simulated samples. 

Table 2 summarizes the various event generators and PDF sets, as well as parton shower and hadronization 
software used for the analyses presented in this paper. 


Table 2: Summary of MC event generators, PDF sets, parton shower and hadronization used in the analysis for 
both signal and background processes. f Madgraph was modified [62] to use the exact top-loop form-factors from 
HPAIR [5,42] for the gluon-fusion production process. 


Model / Process 

Generator 

PDF 

Parton Shower / Hadron. 

Bulk RS: pp —> G' kk —> hh —> bbbb 
2HDM: pp -» H -> hh -> bbbb 

SM: pp —* hh —> bbbb 
tt 

Z+jets 

Madgraph v 1.5.1 [27,28] 
Madgraph v 1.5.1 [27] 
Madgraph v 1.5.1 [27,62]f 
Powheg vl.O [44,45] 

Sherpa vl.4.3 [56] 

CTEQ6L1 [30] 
CTEQ6L1 [30] 
CTEQ6L1 [30] 
CT10 [49] 

CT10 [49] 

Pythia v8.175 [29] 

Pythia v8.175 [29] 

Pythia v8.175 [29] 

Pythia v6.426 [48] 

Sherpa vl.4.3 [56] 
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4 Resolved analysis 


4.1 Event reconstruction 

Jets are reconstructed from topological clusters of calorimeter cell energy deposits at the electromagnetic 
scale [63] using the anti -k t jet clustering algorithm, with a radius parameter of R = 0.4. The effects of 
pile-up on jet energies are accounted for by a jet-area-based correction [64], The jets are then calibrated 
using pj- and //-dependent calibration factors based on MC simulations and the combination of several in 
situ techniques applied to data [65], Following this, the jets undergo Global Sequential calibration [63] 
which reduces flavour-dependent differences in calorimeter response. If a muon with pj > 4 GeV and 
Ipl < 2.5 is found within a cone of A R = 0.4 around the jet axis, the four-momentum of the muon is added 
to that of the jet (after correcting for the expected energy deposited by the muon in the calorimeter). Such 
muons are reconstructed by combining measurements from the inner tracking and muon spectrometer 
systems, and are required to satisfy tight muon identification quality criteria [66]. Jets with a significant 
energy contribution from pile-up interactions [67] are removed using tracking information. For jets with 
p t < 50 GeV and |//| < 2.4, the p-\ sum of tracks matched to the jet is calculated and it is required that at 
least 50% of this pj sum is due to tracks originating from the primary vertex. 2 

Jets with \p\ < 2.5 are //-tagged using the properties of the tracks associated with them, the most important 
being the impact parameter (defined as the track’s distance of closest approach to the primary vertex in 
the transverse plane) of each track, as well as the presence and properties of displaced vertices. The MV 1 
//-tagging algorithm [21] used in this analysis combines the above information using a neural network and is 
configured to achieve an efficiency of 70% for tagging //-jets, 3 with a charm-jet rejection of approximately 
5 and a light-quark or gluon jet rejection of around 140, as determined in an MC sample of tt events. The 
//-tagging efficiency in the simulation is scaled to reproduce the one measured in data [68]. 


4.2 Selection 

The combined acceptance times efficiency (A x e) at different stages of the event selection is shown in 
Fig. 2 as a function of resonance mass for the resonant signal models. The event selection begins with 
the requirement of at least four //-tagged jets, each with pj > 40 GeV (shown in Fig. 2 as “4 //-tagged 
jets”). SM non-resonant Higgs boson pair production has a softer Higgs boson //t spectrum than the 
mx - 500 GeV resonances, resulting in a lower A x s = 4.9% for this requirement. The four highest-// [ Zz- 
tagged jets are then used to form two dijet systems, demanding that the angular distance, A R, between the 
jets in each of the dijets is smaller than 1.5. The transverse momentum of the leading (in pj) dijet system, 
p l j ad , is required to be greater than 200 GeV, while the subleading dijet system must have /x s r Llbl >150 GeV. 
In the rare case that a jet could be used to create more than one dijet which satisfies the above kinematic 
requirements, the dijet with the highest mass is chosen. Thus two unique dijet systems, with no jets in 
common, are selected (shown as “2 dijets” in Fig. 2). For SM non-resonant Higgs boson pair production, 
after this requirement, A x s = 1.2%. The impact of different decay kinematics can be seen by comparing 
Fig. 2b to Fig. 2a: the decay of spin-0 H bosons gives a softer Higgs boson pj spectrum than in the case of 


2 Proton-proton collision vertices are reconstructed requiring that at least five tracks with px > 0.4 GeV are associated with a 
given vertex. The primary vertex is defined as the vertex with the highest summed track pj,. 

3 A jet is labelled as a Zz-jet if a //-quark with transverse momentum above 5 GeV exists within a cone of A R = 0.3 around the jet 
axis. 
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Figure 2: The selection efficiency as a function of resonance mass at each stage of the event selection for (a) C' KK 
—»hh —»bbbb events and (b) H —> hh —» bbbb events in the resolved analysis. 


the spin-2 G^ K decay (due to the differing angular distributions of hh), resulting in lower acceptance for 
these kinematic requirements at low resonance mass. 

The resolved analysis considers a large range of resonance masses, 500 < mx < 1500 GeV. Due to 
the differing kinematics, the optimal selection for low-mass resonances differs from the optimum for 
higher masses. To increase the analysis sensitivity, three requirements which vary with the reconstructed 
resonance mass are used. These selection requirements are optimized simultaneously, by performing a 
three-dimensional scan of threshold values, using the statistical-only exclusion limit (Sect. 6.2) as the 
objective function. There are mass-dependent requirements (shown in Fig. 2 as “MDC”) on the minimum 
pj of the leading and subleading dijets as well as on the maximum difference in pseudorapidity, [Atgets |, 
between them. These requirements are written in terms of the four-jet mass m 4 j expressed in GeV: 




400 GeV 


if m 4 j 

>910 

GeV, 

p ] r d 

> • 

200 GeV 


if m 4 j 

< 600 

GeV, 



0.65m 4 j - 190 

GeV 

otherwise, 




260 GeV 


if m 4 j 

>990 

GeV, 

pf 

> • 

150 GeV 


if m 4 j 

< 520 

GeV, 



0.235/??4j + 28 

GeV 

otherwise, 



| A//, 


dijets 


< 


1.55 x 10“ 


m 4 j 


if m 4j < 820 GeV, 
0.27 otherwise. 


The different m 4 j thresholds shown above are chosen to obtain a continuously varying set of requirements. 
The requirement on |Az^dijets | leads to a lower acceptance for H compared to G^ K for mx > 700 GeV 
because of the effect of the boson spin on the angular distribution of its decay products. 

After selecting two dijets that satisfy the mass-dependent criteria, tt constitutes approximately 10% of the 
total background. This tt background predominantly comprises events where both top quarks decayed 
hadronically. These hadronic decays lead to three jets for each top quark: one 6-jet directly from the top 
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decay and two from the decay of the W boson. Since the probability to mis-tag charm-jets is much higher 
than the probability to mis-tag light-jets, in the majority of cases the dijet is formed from the b -jet and a 
charm-jet from the decay of the W boson. In order to reduce the tt background, jets not already used in 
the formation of the two dijets (“extra jets”) in the event are used to reconstruct W boson and top quark 
candidates by combining them with each of the dijets. These extra jets are required to have pj > 30 GeV, 
|//| < 2.5, and A R <1.5 relative to the dijet. The W boson candidates are reconstructed by adding the 
four-momentum of each of the possible extra jets to the four-momentum of the jet in the dijet system 
with the lowest probability of being a b -jet according to the multivariate ^-tagging algorithm. Top quark 
candidates are then reconstructed by summing the dijets with each of the extra jets. The compatibility with 
the top quark decay hypothesis is then determined using the variable: 



where mw and m, are the invariant masses of the W boson and top quark candidates, <r mw =0.1 mw, 
cr mi = 0.1 m ,, m w = 80.4 GeV and m, = 172.5 GeV. The values of cr mw and cr m , reflect the dijet and 
three-jet system mass resolutions. If either dijet in an event has X tt <3.2 for any possible combination with 
an extra jet, the event is rejected. This requirement reduces the ^background by ~ 60%, whilst retaining 
~ 90% of signal events (shown as “ft Veto” in Fig. 2). 

The event selection criteria described above are collectively referred to as the “4-tag” selection requirements. 
These requirements select 1891 events. 

Following the 4-tag selection, a requirement on the leading and subleading dijet masses (m ] ^ ld and bI , 
respectively) is used to define the signal region. The central value of this region corresponds to the median 
values of the narrowest dijet mass intervals that contain 90% of the MC signal (these were found to be 
stable with resonance mass). The definition of the signal region is 


Xhh = 




Gn' ead - 124 GeV'' 

2j_ 

0.1m“ 


+ 


G< bl 

2j 


- 115 GeV 


0.1m| bl 


( 1 ) 


where the 0.1 m 2 j terms represent the widths of the leading and subleading dijet mass distributions. The 
signal region is defined as Xhh < 1.6. This corresponds to the kinematical requirements illustrated by 
the inner region in Fig. 3, albeit with data from the 2-tag sample shown. It is optimized to maximize the 
expected sensitivity of the search. The acceptance times efficiency of the full selection, including this 
signal region requirement, is shown in Fig. 2 as “Signal Region”. For SM non-resonant Higgs boson pair 
production, the full selection has an A x s = 0.60%. 


The final step of the Higgs boson pair resonant production search is to perform a fit to the four-jet mass 
in the signal region. The sensitivity of this fit is increased by improving the resolution in this region, 
using the constraint that the two dijet masses should equal the Higgs boson mass, i.e. = m™ bl = m/,. 
To this end, each dijet four-momentum is multiplied by a correction factor adijet = »W m dijct- This leads to 
an improvement of ~ 30% in the signal mq resolution—with a significant reduction of the low-mass tails 
caused by energy loss—with little impact on the background. 













4.3 Background estimation 


After the 4-tag selection described above, about 95% of the remaining background in the signal region 
is expected to originate from multijet events, which are modelled using data. The remaining ~ 5% of 
the background is tt events. The tt yield is determined from data, while the m 4 j shape is taken from MC 
simulation. The Z+jets contribution is < 1% of the total background and is modelled using MC simulation. 
The background from all other sources-including processes featuring Higgs bosons-is negligible. 


4.3.1 Multijet background 

The multijet background is modelled using an independent data sample selected by the same trigger and 
selection requirements as described in Sect. 4.2, except for the /Magging requirement: only one of the two 
selected dijets has to be formed from /Magged jets, while the other dijet can be formed from jets that are 
not /Magged. This “2-tag” selection yields a data sample comprising 485377 events, 98% of which are 
multijet events and the remaining 2% are tt. The predicted contamination by the signal is negligible. 

This 2-tag sample is normalized to the 4-tag sample and its kinematical distributions are corrected for 
differences introduced by the additional /Magging. These differences arise because the /Magging efficiency 
as well as the charm- and light-jet rejection vary as a function of jet p-\ and //, the various multijet 
processes contribute in different fractions, and the fraction of events passed by each trigger path changes. 
The normalization and kinematic corrections are determined using a signal-free sideband region of the 
m^ ad -/7j2j bl plane, in dedicated samples collected without mass-dependent requirements, which increases 
the statistical precision of the kinematic corrections. The resulting background model is verified and the 
associated uncertainties are estimated using a control region. The sideband and control regions are shown 

in Fig. 3. The sideband region is defined as: - 124 GcV)~ + (/?7™ bl - 115 GcVj > 58 GeV, 

while the control region is defined as the region between the signal and sideband regions. These definitions 
are chosen to be orthogonal to the signal region and to give approximately equal event yields in the 
sideband and control regions. 

The normalization of the multijet background prediction is set by scaling the number of events in each 
region of the 2-tag sample by the following factor, pqcd, calculated in the sideband region: 


/^qcd - 


^4-tag 

iv QCD 


n: 


2 -tag 


QCD 


A^ tag _ AT 4 “ tag - At 4_tag 
iV data ly tt iV Z 

w 2 - tag _ Af 2-tag _ a r 2 ~ tag ’ 

■'''data ly tf ly Z 


( 2 ) 


where 7 V~ af ^ 4 tag is the number of events observed in the sideband region in the 2- or 4-tag data sample, 
respectively. The yields N~ f ' 4 tag and N~ /4 tag are the estimated number of tt and Z+jets events in the 
2-/4-tag selected sideband region. The tt normalization is estimated from data, as described in Sect. 4.3.2, 
while the Z+jets is estimated using MC simulation. 

To predict the distributions of the multijet background in each region, the predicted tt and Z+jets 2-tag 
distributions are first subtracted from the 2-tag data sample distribution before the distribution is scaled by 
/'QCD- 

The correction for the kinematic differences between 2-tag and 4-tag samples is performed by reweighting 
events in the 2-tag sample. The weights are derived in the sideband region from linear fits to the ratio 
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m§ ad [GeV] 


Figure 3: The distribution of the subleading dijet mass, m~ bl , vs the leading dijet mass, m \“ d , for the 2-tag data 
sample used to model the multijet background in the resolved analysis. The signal region is the area surrounded by 
the inner black contour line, centred on wljj ad = 124 GeV, m^ bl = 115 GeV. The control region is the area inside the 
outer black contour line, excluding the signal region. The sideband region is the area outside the outer contour line. 


of the total background model to data for three kinematic distributions which are found to have the 
largest disagreement between 2-tag and 4-tag events: the leading dijet pj, the A R separation between 
the jets in the subleading dijet, and the A R separation between the two dijets. The reweighting is done 
using one-dimensional distributions, but is iterated so that correlations between the three variables are 
approximately accounted for. Three iterations are found to be sufficient. After the correction process, 
there is good agreement between the background model and sideband region data in kinematic variables 
that were not explicitly corrected. Systematic uncertainties in the normalization and shape of the multijet 
background model in the signal region are assessed using control-region data, as described in Sect. 4.4. 


4.3.2 tt background 

The tt background is described using a hybrid model: the normalization is derived from data in a tt control 
sample, while the shape is taken from MC simulation because there are too few events in the tt control 
sample to describe the shape precisely enough. 

The tt control sample is formed from events which pass the 4-tag selection, except for the top veto, which 
is reversed: if either of the dijets fails the top veto, the event enters the tt control sample. This selection 
leads to a sample of 41 events within the signal region of the tt control sample, of which ~ 50% are tt and 
~ 50% multijet. The multijet background component is estimated using the same methods as used for the 
nominal selection, but with a wider control region in order to reduce the sideband region tt fraction. After 
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subtracting the multijet background, the tt control sample yield is then extrapolated to predict the tt yield 
in the nominal signal region, N t y, using the following equation: 


N» = 


1 - 6 ; 


x N c - S 

2 ly tt ’ 


(3) 


where N^ s is the number of events in the signal region, after subtraction of the multijet background, within 
the tt control sample, and e, is the efficiency for a selected dijet in a tt event to pass the top veto. This 
equation relies on the assumption that the e t of each dijet in the event is uncorrelated, an assumption 
validated in tt MC simulation. The e t is measured using an independent “semileptonic tt” data sample 
that has a high tt purity. Events in this sample are selected by requiring one dijet candidate to pass the 
nominal selection with pj > 150 GeV and one “leptonic top-quark” candidate. The leptonic top quark 
candidate is defined using a reconstructed muon and one 6-tagged jet. This 6-tagged jet is required to 
be distinct from jets in the dijet candidate, and the muon is required to have p-y > 25 GeV, be isolated, 
and fall within a cone of radius 1.2 around the 6-tagged jet. The leptonic top quark candidate is required 
to have pj >180 GeV, where the leptonic top p-y is defined as the vector sum of the 6-jet p-y, the muon 
py, and the missing transverse momentum in the event. The latter is reconstructed from energy deposits 
in the calorimeter, including corrections for identified jets, electrons and muons. The tt veto efficiency 
is then measured as the fraction of the reconstructed dijet candidates which passed the tt veto, yielding 
6 f = 0.53 ± 0.03 (stat.) ± 0.05 (syst.). The systematic uncertainty in e t is assigned to cover potential 
differences between e t as measured in the semileptonic tt sample and e t in the full 4-tag selection, where 
the method is applied in tt MC simulation to evaluate such differences. The measured e, agrees well with 
the corresponding semileptonic it MC prediction of 0.54. 


Equation (3) gives a data-driven tt background prediction of 5.2 ± 2.6 events in the signal region after 
the full selection. The uncertainty is dominated by the statistical uncertainty in V? s , with a smaller 
contribution from the uncertainty in e t . 

Due to the limited number of events in the tt control sample, the shape of the tt background is modelled 
using MC simulation. However, despite the use of a large tt sample, very few events pass the full 4-tag 
selection. Therefore, the tt shape is derived from MC simulation using the “2-tag” selection, with a 
systematic uncertainty assigned to cover differences between the 2-tag and 4-tag distributions. 


4.4 Systematic uncertainties 

Two classes of systematic uncertainties are evaluated: those affecting the modelling of the signal and those 
affecting the background prediction. 

The signal modelling uncertainties comprise: theoretical uncertainties in the acceptance, uncertainties in 
the jet energy scale (JES) and resolution (JER), and uncertainties in the 6-tagging efficiency. 

The theoretical uncertainties considered arise from initial- and final-state radiation modelling (ISR and 
FSR), PDF uncertainties and uncertainty in the LHC beam energy. These are estimated using particle- 
level samples generated using the same generator configurations as the nominal signal samples but with 
appropriate variations, assessing the difference in yields after the full analysis selection. The ISR and FSR 
uncertainty is evaluated by varying the relevant parton shower parameters in Pythia 8. The PDF uncertainty 
is estimated by taking the maximum difference between the predictions when using MSTW2008nlo [69], 
NNPDF2.3 [70] and CTEQ6L1. The uncertainty due to the beam energy is determined by varying 
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Table 3: The number of events in data and predicted background events after applying the mass-dependent require¬ 
ments in preselection and in the sideband and control regions for the resolved analysis. The uncertainties are purely 
statistical. The tt yield in this table, in contrast to the final result, is estimated using MC simulation. 


Sample 

Sideband Region 

Control Region 

Multijet 

907 ±3 

789 ±3 

tt 

25.5 ±0.3 

57.5 ± 0.4 

Z+jets 

14 ± 1 

20 ± 1 

Total 

947 ± 3 

867 ± 3 

Data 

952 

852 


coherently the energy of each beam by ±26.5 GeV [71] in the simulation. Only FSR has a significant 
impact on the acceptance, leading to a ± 1.0% theoretical modelling acceptance uncertainty. 

The JES systematic uncertainty is evaluated using 15 separate and orthogonal uncertainty components, 
which allow for the correct treatment of correlations across the kinematic bins [65]. The JER uncertainty is 
evaluated by smearing jet energies according to the systematic uncertainties of the resolution measurement 
performed with data [65]. For 6-jets with pj < 300 GeV the uncertainty in the 6-tagging efficiency is 
evaluated by propagating the systematic uncertainty in the measured tagging efficiency for 6-jets [68], 
which ranges from 2% to 8% depending on 6-jet p-\ and //. However, for the higher resonance masses 
considered in this analysis, there are a significant number of events containing at least one 6-jet with 
Pt > 300 GeV. The systematic uncertainties in the tagging efficiencies of these jets are derived from MC 
simulation and are larger, reaching 24% for p-\ > 800 GeV. 

Systematic uncertainties in the normalization and shape of the multijet background model are assessed in 
the control region. Table 3 shows the estimated background yields in the control and sideband regions. The 
control region background prediction agrees with the observed data within the data statistical uncertainty 
of ±3.5%. To further test the robustness of the background estimation and the assumptions behind 
it, predictions are made with different sideband and control region definitions and different 6-tagging 
requirements on the 2-tag sample. Redefinitions of the sideband and control region changed the kinematic 
composition of these regions, enhancing the sideband region in either high mass or low mass dijets and 
therefore altering the kinematic corrections that are applied. These variations induce a maximum change 
of ±6% in the estimated multijet yield and so the uncertainty is set to this value. Different 6-tagging 
requirements on the 6-tagged dijet in the 2-tag sample are used in order to change the composition of the 
sample and to vary the degree of 6-tagging-related kinematic bias. No additional uncertainty is required. 

The uncertainty in the description of the multijet m 4 j distribution is determined by comparing the total 
background prediction to data in the control region, as shown in Fig. 4. Good agreement in the shape is 
observed and a straight line fit to the ratio of the distributions gives a slope consistent with zero. This 
fit, along with its uncertainties, is shown in the bottom panel of Fig. 4. The uncertainty in the multijet 
background shape is defined using the uncertainty in the fitted slope. 

The uncertainty in the it normalization is dominated by the statistical uncertainty of the yield in the 
tt control sample, with a subdominant contribution from the uncertainties in the top veto efficiency, e t , 
leading to a total uncertainty of ±50%. The uncertainty in the MC-derived tt m, 4 j distribution is dominated 
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Figure 4: The four-jet mass, , distribution in the control region (points) for the resolved analysis, compared 
to the predicted background (histograms). The small filled blocks represent the statistical uncertainty in the total 
background estimate. The bottom panel shows the first-order polynomial fit to the data-to-background ratio of 
the « 4 j distribution that is used to fix the multijet shape systematic uncertainty. The dashed lines show the + l<x 
uncertainties in the two fitted parameters. 


by the uncertainty associated with using the shape after the 2-tag selection, rather than the 4-tag selection. 
This uncertainty is assessed by comparing the 2-tag to 4-tag MC predictions in the signal region. A straight 
line fit to the ratio of the normalized distributions is made and used to define a shape uncertainty in the 
same way as the multijet background. Due to the large statistical uncertainties of the 4-tag tt sample, the 
assigned shape uncertainty is large: ~ 30% and ~ 100% in the event yield at = 400 GeV and 1500 GeV, 
respectively. 

Table 4 shows the relative impact of the uncertainties in the event yields. Figure 5 shows the relative impact 
on the expected limit for cr[pp —» C/' KK —> hh —» bbbbj. The calculation of the expected limit is described 
in Sect. 6.2. It can be seen that for resonance masses below 700 GeV, the effect on the limit is dominated 
by the multijet description, with a small contribution from the tt background since both backgrounds are 
predominately at low mass. Above mx = 700 GeV, the uncertainty associated with the modelling of the 
^-tagging efficiency has the largest impact, since the larger high-pj uncertainties have an increasingly 
important effect with mass. 


4.5 Results of the resolved analysis 

Table 5 shows the predicted number of background events in the signal region, the number of events 
observed in the data, and the predicted yield for two potential signals. The numbers of predicted background 
events and observed events are in excellent agreement. 

Figure 6 shows a comparison of the predicted background distribution to that observed in the data. 
The predicted background agrees with the observed distributions, with no significant deviation. 
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Table 4: Summary of systematic uncertainties (expressed in percent) in the total background and signal event yields, 
in the signal region of the resolved analysis. Signal yield uncertainties are provided for non-resonant SM Higgs 
boson pair production and three resonances with m = 1000 GeV: a G' KK with k/My] = 1, another with k/M ? \ — 2, 
and H with fixed T w = 1 GeV. 


Source 

Bkgd 

SM hh 

°KK 

J- - l J- = 2 

Mpi Mpi 

H 

Luminosity 

- 

2.8 

2.8 

2.8 

2.8 

JER 

- 

4.5 

1.1 

1.1 

2.0 

JES 

- 

7 

1.8 

1.3 

3.4 

6-tagging 

- 

12 

22 

21 

22 

Theoretical 

- 

1.0 

1.1 

1.1 

1.1 

Multijet 

6.0 

- 

- 

- 

- 

it 

3.0 

- 

- 

- 

- 

Total 

6.7 

15 

22 

22 

23 



m G . KK [GeV] 


Figure 5: The individual impact of the systematic uncertainties considered in the resolved analysis on the expected 
cr [pp —■> G* kk —> hh —> bbbb'j 95% confidence level exclusion limit, as a function of graviton mass. The calculation 
of the expected limit is described in Sect. 6.2. Only the mass-dependent uncertainties are shown. The impact is the 
ratio of the limit calculated using all systematic uncertainties sources to the limit calculated using all systematic 
uncertainty sources excluding those under investigation. 


5 Boosted analysis 

5.1 Event reconstruction 

The boosted analysis differs from the resolved analysis primarily by the use of large-radius jets designed to 
contain the decay products of a single h —* bb decay. Those large-radius jets, denoted by the subscript J in 
the remainder of this paper, are reconstructed from locally calibrated topological clusters of calorimeter 
cells [63] using the anti-£ f jet clustering algorithm with a radius parameter of R - 1.0. To minimize the 
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Table 5: The number of predicted background events in the hh signal region for the resolved analysis, compared to 
the data. Uncertainties correspond to the total uncertainties in the predicted event yields. The yield for two potential 
signals, SM non-resonant Higgs boson pair production and a 500 GeV C' KK in the bulk RS model with k/M P \ = 1 are 
shown, with the uncertainties taken from Table 4. 


Sample 

Signal Region Yield 

Multijet 

81.4 ±4.9 

tt 

5.2 ±2.6 

Z+jets 

0.4 ± 0.2 

Total 

87.0 ±5.6 

Data 

87 

SM hh 

0.34 ± 0.05 

G^ k (500 GeV), jfe/Mpi = 1 

27 ± 5.9 
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Figure 6: Distribution of the four-jet mass, m^j, in the signal region of the resolved analysis for data (points) compared 
to the predicted background (solid histograms). The filled blocks represent the combined statistical and systematic 
uncertainty in the total background estimate. Two simulated signal peaks for the bulk RS model with k/M P \ = 1 
are shown as dashed lines. 


impact of energy depositions due to pile-up and the underlying event, the jets are trimmed [22]. This 
trimming algorithm reconstructs subjets within the large-/? jet using the k x algorithm with radius parameter 
f? su b = 0.3, then removes any subjet with pj less than 5% of the large-/? jet p \ . Further calibration of both 
the energy and mass scales is applied as a function of pj and rj as determined from simulation and in situ 
measurements [65]. 

A novel aspect of the boosted technique presented here is the use of track-jets [23] to identify the presence of 
6-quarks inside the large-/? jet. Such track-jets are built solely from tracks with p-\ > 0.5 GeV and |//| < 2.5, 
satisfying a set of hit and impact parameter criteria to make sure that those tracks are consistent with 
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originating from the primary vertex, thereby reducing the effects of pile-up. Track jets are reconstructed 
using the anti -k t algorithm with R = 0.3. Flavour-tagging of those track-jets proceeds in the same way as 
for the R = 0.4 calorimeter jets used in the resolved analysis described in the previous section, except for a 
slightly looser requirement on the output of the MV 1 neural network for a track-jet to be ^-tagged. This 
leads to b -jets being ^-tagged with an efficiency of 74%, with a charm-jet rejection factor of approximately 
4 and a light-quark or gluon jet rejection factor of around 65, as determined in an MC sample of tt events. 
The ^-tagging efficiency for track-jets in the MC simulation is adjusted based on studies of tt events in the 
data (Sect. 5.4). 


5.2 Selection 

The combined acceptance times efficiency at different stages of the event selection for the boosted analysis 
is shown in Fig. 7. 

Events are required to contain at least two large-R jets with pj > 250 GeV and |//| < 2.0. To suppress 
contamination from tt events, the leading jet is additionally required to have pj > 350 GeV. This ensures 
that the top-quark decay products are typically fully contained in a single large-/'? jet with mass close to 
that of the top quark. These requirements are shown in Fig. 7 as “2 large-R jets”. Only the leading and 
subleading large-R jets are retained for further consideration. 

Track jets are associated with large-R jets using “ghost association” [64,72,73]. Each of the leading and 
subleading large-R jets must have at least two track-jets ghost-associated with their respective untrimmed 
parents, where the track-jets must have pj > 20 GeV and |//| < 2.5, as well as be consistent with originating 
from the primary vertex of the event (shown in Fig. 7 as “4 track-jets”). The drop in the Axe value at 
masses above 1500 GeV is due to the decrease in the angular separation between the two track-jets from 
the h —> bb decay to below AR = 0.3. 

To suppress contamination from multijet events, the two selected large-R jets in the event are required to 
have a separation | Arj\ < 1.7. This requirement (shown in Fig. 7 as “A 77 ”) has only a small impact on the 
signal acceptance since high-mass resonances tend to produce jets that are more central than those from 
multijet background processes. 

Selection of h —* bb candidates proceeds by requiring that both the leading and subleading track-jets 
associated with each of the two large-R jets satisfy the ^-tagging selection (shown in Fig. 7 as “4 b-tagged 
jets”). 

A final correction to the large-R jet four-momentum is applied to account for semileptonic b -hadron decays. 
If a muon passing the requirements outlined in Sect. 4.1 is ghost-associated with any of the selected 
^-tagged track-jets, its four-momentum is added to that of the large-R jet. If more than one muon is 
associated with a given track-jet, the muon closest to the track-jet axis is used. This correction improves 
the mass resolution for large-R jets in signal MC simulation, especially for the subleading jet. 

The last requirement used to select signal event candidates is to require that the large-R jet mass is consistent 
with the Higgs boson mass. This requirement is defined identically to that for the resolved analysis in 
Eq. (1), except for the replacement of the small-R dijet mass with the large-R jet mass. The signal region is 
defined by the requirement X^h < 1.6. This final selection is shown in Fig. 7 as “Signal Region”. 
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(a) Bulk RS, k/M Pl = 1 (b) pp -> H hh ->bbbb with fixed T H = 1 GeV 

Figure 7: The selection efficiency as a function of resonance mass at each stage of the event selection for (a) G’' KK 
—> hh —> bbbb events and (b) H —> hh —> /?/?/?/? events in the boosted analysis. 


5.3 Background estimation 

After the 4-tag selection described in Sect. 5.2, the background composition is similar to that of the 
resolved analysis. Multijet events comprise approximately 90% of the total background and are modelled 
entirely using data. The remaining ~ 10% of the background is tt events. The tt yield is determined using 
data, while the m 2 j shape is taken from MC simulation. The Z+jets contribution is < 1% of the total 
background and is modelled using MC simulation. The background from all other sources—including 
processes featuring Higgs bosons—is negligible. 

Estimation and validation of the background described below relies on two data samples defined as 
follows. 

• The “4-tag” sample corresponds to the set of events that satisfy all the requirements detailed in 
Sect. 5.2, except that the final requirement on the mass of the leading and subleading large-/? jets is 
not applied. 

• The “2+3-tag” sample is identical to the 4-tag sample except for having only two or three of the four 
track-jets ^-tagged. For events with only two ^-tagged track-jets, both are required to be associated 
with the same large-/? jet. 

Both samples are further subdivided based on the large-/? jet masses, with each subsample having a 
sideband region to determine the multijet background kinematics and a control region to validate the 
background estimate. The control region is defined by requirements on the mass of the leading and 
subleading large-/? jets of 95 < ml ead < 160 GeV and 85 < mj ubl < 155 GeV respectively, while excluding 
the signal region defined by A/,/, < 1.6. The sideband region is complementary to the signal and control 
regions. Figure 8 illustrates the sideband and control regions with data from the 2+3-tag sample. 

The choice of control region (and consequently sideband region) ensures that the multijet background can 
be estimated by extrapolation of event yields and kinematic properties from the 2+3-tag sample to the 
4-tag sample with a normalization given by the relative event yields in the sideband region. Furthermore, 
the control region is chosen such that event kinematics in that region are representative of the kinematics 
in the signal region. 
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Figure 8: The leading-subleading large-/? jet mass distribution for the 2-tag and 3-tag data sample in the boosted 
analysis. The signal region is the area surrounded by the inner black contour line, centred on ;nj ead = 124 GeV and 
Wj ubl = 115 GeV. The control region is the area inside the outer black contour line, excluding the signal region. The 
sideband region is the area outside the outer contour line. 


The estimated background yield in the 4-tag sample, /V^ ag 


, is computed according to 


="wo <cd ■“+««-+ 4-“ . <4) 

where N^ D tag is the number of multijet events in the 2+3-tag data sample, N^~ tag and /V^ tag are the 
numbers of events in the 4-tag tt and Z+jets MC samples. The parameter //qcd corresponds to the ratio of 
multijet event yields in the 4-tag and 2+3-tag data samples, as defined in Eq. (2), except for including both 
2- and 3-tag events in the denominator. Finally, the parameter a t j is a scale factor designed to adjust the tt 
event yield from the MC simulation. Both /tqcd and a a are extracted from a binned likelihood fit to the 
leading large-/? jet mass distribution obtained in the sideband region of the 4-tag data sample, as depicted 
in Fig. 9. Due to the large minimum pj requirement for the leading large-/? jet, much of the tt contribution 
is concentrated at high mass close to the top-quark mass. In this fit, the multijet distribution is extracted 
from the 2+3-tag data sample, after subtraction of the tt and Z+jets contributions predicted by the MC 
simulation. The tt and Z+jets distributions in the sideband region of the 4-tag data sample are taken from 
the MC simulation, but the Z+jets contribution is very small and its distribution is added to the multijet 
distribution for the fit. The resulting fit values are pqcd = 0.0071 + 0.0007 and a t j - 1.44 ± 0.50 with a 
correlation coefficient of -0.67 between these two parameters. 

Figure 10a shows the dijet mass distribution for the 4-tag data sample in the sideband region with the 
background estimated using the above method. This figure indicates that the 2+3-tag sample provides a 
valid description of the background kinematics in the 4-tag sample. The modelling of the background 
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Figure 9: Leading large-/? jet mass distribution for 4-tag events in the sideband region for data (points) and the two 
dominant sources of background for the boosted analysis. The normalization for each of those two background 
components is obtained with a fit to the data as described in the text. 


yield and kinematics is further validated by testing in the control region of the 4-tag data sample. Good 
agreement is observed between the data and the predicted background in various kinematical distributions 
for leading and subleading large-/? jets, as well as for the dijet mass, as shown in Fig. 10b. The shapes of 
the tt kinematical distributions in the signal region are determined from the MC simulation requiring only 
three ^-tagged track-jets instead of four due to the limited MC sample size. The number of tt events is then 
normalized to the expected yield in the 4-tag sample times a t j. It was checked that this does not introduce 
a bias discernible with the statistical precision of the tt MC sample. 
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Figure 10: Dijet mass distributions for 4-tag events in the boosted analysis, (a) shows the sideband region and (b) the 
control region for data (points) and the expected background (histograms). The filled blocks represent the statistical 
uncertainty in the total background estimate. 


19 

















































5.4 Systematic uncertainties 

Systematic uncertainties can be grouped into two classes: those affecting modelling of the signal as 
extracted from simulation and those arising from the background estimate. 

The signal modelling is affected by two main sources of experimental uncertainty. One is related to 
large-R jets and the other is related to the efficiency for 6-tagging track-jets. For large-/? jets, the following 
uncertainties are accounted for: jet energy scale and resolution, as well as jet mass scale (JMS) and 
resolution (JMR). In the kinematic region relevant to this analysis, the JES uncertainty is below 2% and 
that for JMS is ~ 2-5%. The ES uncertainty is derived with the y-jet balance method for pj < 800 GeV 
and the track-jets double-ratio method for p \ > 800 GeV, as described in Ref. [72], The latter method is 
also used for the derivation of JMS uncertainties in the full p \ range. An uncertainty of 20% is applied to 
account for modelling of the jet energy and mass resolutions. The magnitude of this resolution uncertainty 
is estimated from studies of boosted W boson production performed using the 2012 data. Jet energy and 
mass uncertainties are treated as uncorrelated in the statistical analysis. The uncertainty in modelling the 
/Magging efficiency for the track-jets used in this analysis is applied to the signal and Z+jets MC samples. 
It is extracted as a function of pj using the tag-and-probe method on a sample of dilepton events from 
semileptonic tt decays. The resulting uncertainty varies between 2% and 7%, with the largest value obtained 
for track-jets with px > 100 GeV. This uncertainty includes the following effects: statistical precision 
of the calibration data sample, choice of event generator and parton shower for the simulated tt sample, 
initial- and final-state radiation, and flavour composition. For px > 250 GeV, the uncertainties must be 
evaluated using MC simulation due to the small number of data events. Consequently, the uncertainties 
increase, reaching 14% for px > 600 GeV. Studies in a tt data sample with a single-lepton+jets final state 
indicate that the presence of nearby jets does not have a measurable effect on the 6-tagging efficiency and 
thus no additional uncertainty is required for nearby jets. 

In addition to purely statistical sources of uncertainty, the background estimate is sensitive to the following 
other sources. The multijet background normalization is validated with the observed yield in the control 
region and the statistical uncertainty of this test is included as a systematic uncertainty. The shape of the 
tt background used in the fit shown in Fig. 9 is varied by extracting the shape from MC samples with 
zero, one, two or three 6-tagged track-jets. Similarly, the uncertainty in the shape of other tt kinematical 
distributions is extracted from those samples. The uncertainty in the shape of the multijet background 
extracted from the sideband region of the 2+3-tag sample is constrained by the level of agreement between 
the background prediction and the observed data in the control region following the procedure described in 
Sect. 4.4. Good agreement is observed between the data and the predicted background in both the sideband 
and control regions of the 4-tag sample as shown in Table 6. 

Systematic uncertainties in both the background and signal event yields are summarized in Table 7. 
A 2.8% luminosity uncertainty is applied to the Z+jets background and to the signal samples. The 
JER/JES/JMR/JMS uncertainties are applied to signal, tt and Z+jets samples. The track-jet 6-tag uncertainty 
is applied only to the signal samples as the normalization and shape differences in the tt sample are 
accounted for through other sources of systematic uncertainty. 

Theoretical uncertainties affecting the signal acceptance are also considered, as described in Sect. 4.4. These 
sources do not have significant dependence on the assumed resonance mass and the largest contribution is 
found to be due to the ISR modelling. 

The uncertainty in the multijet event yield is derived from the difference between the predicted and 
observed multijet yields in the control region. This source of uncertainty is dominated by the statistical 
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Table 6: The number of events in data and predicted background events in the sideband and control regions of the 
4-tag sample for the boosted analysis. The uncertainties are purely statistical. 


Sample 

Sideband Region 

Control Region 

Multijet 

221 ± 1 

53.8 ±0.6 

tt 

52.8 ±0.6 

9.8 ± 0.3 

Z+jets 

3.80 ± 0.26 

1.57 ±0.17 

Total 

278 ± 1 

65.2 ± 0.7 

Data 

281 

68 


Table 7: Summary of systematic uncertainties (expressed in percent) in the total background and signal event yields in 
the signal region for the boosted analysis. Uncertainties are provided for a resonance mass of 1.5 TeV in the context 
of the bulk RS model with k/Mp\ = 1 or 2, as well as for a Type-II 2HDM with T H = 1 GeV, cos (/3 - a) = -0.2 and 
tan = 1. 


Source 

Bkgd 

k/Mpi = 

^KK 

1 k/Mpi = 2 

H 

Luminosity 

0.2 

2.8 

2.8 

2.8 

JER 

0.9 

0.1 

0.2 

0.1 

JES 

0.4 

0.1 

2.5 

0.1 

JMR 

4.3 

13 

13 

12 

JMS 

1.3 

18 

17 

16 

^-tagging 

- 

21 

20 

21 

Theoretical 

- 

2.0 

2.0 

2.0 

Multijet 

12 

- 

- 

- 

tt 

2.5 

- 

- 

- 

Bkgd stat. 

8.9 

- 

- 

- 

Total 

15.9 

33 

28 

30 


uncertainty in that region. The tt entry in Table 7 accounts for the shape uncertainty in the simulated 
tt leading-jet mass distribution in the sideband region used to fit for pqcd and a,j. This uncertainty is 
determined by comparing the shape of the 4-tag and 2-tag tt distributions. Finally, the “Bkgd stat” accounts 
for the statistical uncertainties in the extraction of /uqcd and a,j. Uncertainties in the m 2 j shape of the 
multijet and tt backgrounds are not listed in Table 7, as they do not affect the event yield, but are accounted 
for in the statistical analysis. 

Figure 11 presents the impact of each source of systematic uncertainty on the expected cross-section limit 
for the production of 6'" KK as a function of resonance mass with the choice k/Mpi = 1- These values are 
obtained following the statistical analysis described below while neglecting each source of uncertainty 
in turn. The multijet background uncertainty dominates for resonance masses below 1000 GeV, with 
^-tagging, large-7? jet mass and the number of sideband data events for the background estimate becoming 
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Table 8 : The number of predicted background events in the hh signal region, compared to the data for the boosted 
analysis. Errors correspond to the total uncertainties in the predicted event yields. The yield for a 1000 GeV C' KK in 
the bulk RS model with k/M ? 1 = 1 is also given. 


Sample 

Signal Region Yield 

Multijet 

23.5 ±4.1 

tt 

2.2 ± 0.9 

Z+jets 

0.14 ±0.06 

Total 

25.7 ± 4.2 

Data 

34 

G* KK (1000 GeV), k/Mpi = 1 

2.1 ± 0.6 
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: The individual relative impact of the systematic uncertainties considered in the boosted analysis on the 
cr [pP * G'k K —> hh —> hhhh'j 95% confidence level exclusion limit, as a function of graviton mass. The 
calculation of the expected limit is described in Sect. 6.2. Only the mass-dependent uncertainties are shown. The 
impact is the ratio of the limit calculated using all systematic uncertainties sources to the limit calculated using all 
systematic uncertainty sources excluding those under investigation. 
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5.5 Results of the boosted analysis 

A total of 34 events is observed in the data whereas the background expectation is estimated to be 25.7 ±4.2, 
see Table 8 for a breakdown of the various sources of background. The significance of this excess of events 
in the data is evaluated below. 

The dijet mass distribution in the signal region is shown in Fig. 12. For this distribution and the statistical 
analysis, the estimated background prediction from multijet (ft) events is tit to an exponential function at 
masses above 900 (800) GeV and the associated uncertainty is propagated to the statistical analysis. 
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Figure 12: Dijet mass distributions for data (points) as well as expected background (solid histograms) in the signal 
region of the boosted analysis. The filled blocks represent the combined statistical and systematic uncertainty in the 
total background estimate. Two simulated GJ^ signal peaks predicted by the bulk RS model with k/Mp\ — 1 are also 
shown as dashed lines. 


6 Results 

The results from the analyses in Sects. 4.5 and 5.5 are interpreted separately using the statistical procedure 
described in Ref. [1] and references therein. Hypothesized values of p, the global signal strength factor, 
are tested with a test statistic based on the profile likelihood ratio [74,75]. In the profile likelihoods, the 
maximum likelihood values are obtained with the systematic uncertainties treated as independent, Gaussian 
or log-normal constraint terms. The statistical analysis described below is performed using data from the 
signal region solely. In the case of the search for non-resonant hh production, only the number of events 
passing the final selection is used whereas the m 4 j or «? 2 .i distributions are used in the case of the search for 
hh resonances. 


6.1 Background-only hypothesis tests 

Tests of the background-only hypothesis (p = 0) are carried out to determine if there are any statistically 
significant local excesses in the data. The significance of an excess is quantified using the local pa, the 
probability that the background could produce a fluctuation greater than or equal to the excess observed 
in data. A global pa is also calculated for the most significant discrepancy, using background-only 
pseudo-experiments to derive a correction for the look-elsewhere effect across the mass range tested. 

In the case of the resolved analysis, the largest deviation from the background-only hypothesis is found to 
be 2.1 cr for a pp —* H —> lih —> bbbb signal with fixed T# = 1 GeV at m 4 j = 1200 GeV. This corresponds 
to a global significance of 0.42 cr. The significance of any deviation for a G' KK signal with k/Mp\ = 1 is 
very similar, albeit with slightly smaller local discrepancies as a result of the larger signal width. 
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In the case of the boosted analysis, the largest local deviation corresponds to the data excess at /?i 2 j ~ 900 GeV 
apparent in Fig. 12, with a local significance of 2.6 c r for pp —» G^ K —>hh —»bbbb with k/Mp\ = 1. The 
global significance of this deviation corresponds to 0.78 cr. 

Given these low significance values, the results of both analyses are consistent with the background-only 
hypothesis. Of the 117 events selected in the data by either the resolved or boosted analysis, only four 
events are common to both. 


6.2 Exclusion limits 

The data are used to set upper limits on the cross-sections for the different benchmark signal processes. 
Exclusion limits are based on the value of the statistic CL S [76], with a value of p regarded as excluded at 
95% confidence level (CL) when CL S is less than 5%. 

The non-resonant search is performed using the resolved analysis, since it has better sensitivity than the 
boosted analysis. Using the SM hh non-resonant production as the signal model, the observed 95% CL 
upper limit is cr{pp —» hh —> bbbb) = 202 fb. This can be compared to the inclusive SM prediction (as 
defined in Sect. 3) of <r(pp —> hh —» bbbb) = 3.6 ± 0.5 fb. 

Lor the resonant Higgs boson pair production search, the resolved and boosted analyses offer their best 
sensitivity in complementary resonance mass regions. Ligure 13 shows the expected and observed cross- 
section upper limits from each analysis for pp —» G^ K —> hh —> bbbb within the bulk RS model with 
k/Mp\ = 1. The resolved analysis can be seen to give a more stringent expected exclusion limit for 
resonance masses up to 1100 GeV, while the boosted analysis offers better sensitivity beyond that mass. 
This motivates a simple combination of the separate exclusion limits from the resolved and boosted 
analyses. Lor each of the signal models, the limit for each mass point is taken from the analysis which 
offers the most stringent expected exclusion. 
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Expected Limit (95% CL) 
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(a) Resolved Analysis 


(b) Boosted Analysis 


(c) Overlay of expected limits 


Figure 13: The expected and observed limits for the bulk RS model with k/Mp\ = 1 for (a) the resolved analysis and 
(b) the boosted analysis. The overlay of expected limits is shown in (c), demonstrating that the resolved analysis 
gives better sensitivity for m c ^ < 1100 GeV, while the boosted analysis is better for mc* KK > 1100 GeV. The red 
curves show the predicted cross-section as a function of resonance mass for the model considered. 


Figure 14 shows the combined 95% CL upper limits for three signal models: pp —> G' KK —» hh —» bbbb 
within the bulk RS model with k/M?\ = 1 and 2, and the pp —» H —»hh —> bbbb with a fixed Yji = I GeV. 
The most stringent limits of o~(pp —> X —> hh — * bbbb} ~ 3 fb are set in the range 900 < mx < 1600 GeV, 
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(a) Bulk RS, k/M PI = 1 (b) Bulk RS, k/M n = 2 (c) pp -> H -> hh with fixed T„ = 

1 GeV 


Figure 14: The combined expected and observed limit for pp —> C' KK —> hh —> hhhh in the bulk RS model with (a) 
k/M P ] = 1 and (b) k/M P] = 2, as well as (c) pp —> H —> hh —> bbbb with fixed Y H = 1 GeV. The red curves show the 
predicted cross-sections as a function of resonance mass for the models considered. 


Table 9: Range of KK graviton masses excluded at 95% confidence level for kjM\>\ = 1.0,1.5 and 2.0. 


k/Mpi 

95% CL Excluded G^ K Mass Range [GeV] 

1.0 

500 - 720 

1.5 

500-800 and 870-910 

2.0 

500 - 990 


where there is little expected background and either the resolved or boosted analysis provides good signal 
acceptance. The excluded mass ranges for the bulk RS KK graviton are shown in Table 9. 

The excluded mass range for the 2HDM is parameter dependent, principally because the production 
cross-section varies, but also because the exclusion limit depends on the parameter-dependent H boson 
width, Yh- The theoretical cross-section used to determine the 95% CL excluded regions is the sum of the 
cross-sections of gluon-fusion production, vector-boson-fusion production and ^-associated production. 

The effects of T// are accounted for by creating /;?// distributions with a range of widths, 0 < T< 0.5, 
for each mu considered. These distributions are based on parameterizations which include resolution and 
acceptance effects combined with a Breit-Wigner line-shape. A grid of limits are calculated with each 
of these mass distributions. Then, for each point in mu, cos (J3 - a), and tan/3 space, the cross-section 
limit is determined by interpolating between the appropriate limits, based on the Yu given by the model 
for that point. For the widest signals considered, the exclusion limits worsen by up to a factor of three. 
The exclusion regions determined through this process are shown as a function of cos (J3 - a) and tan/3 for 
mu = 500 GeV in Figs. 15 and 16, and as a function of mu and tan/3 for cos (/3 - a) = -0.2 in Figs. 17 
and 18. The validity of the process has been tested using the widest available signals, gravitons in the bulk 
RS model with k jM\>\ = 2. Phase-space regions with Yu greater than these graviton widths are considered 
unvalidated and are shown in the figures as grey areas. 
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(a) Type-I 2HDM, m H = 500 GeV (b) Type-II 2HDM, m H = 500 GeV 


Figure 15: Excluded regions of the (cos (J3 - a), tan/j) parameter space for (a) the Type-I 2HDM signal model and 
(b) the Type-II 2HDM signal model. The grey areas demarcate the phase-space regions where F h/wh > 0.15, for 
which the cross-section limits have not been demonstrated to be reliable. 




-1.0 -0.8 -0.6 -0.4 -0.2 0.0 0.2 0.4 0.6 0.8 1.0 

cos(|3-a) 


(a) Lepton-specific 2HDM, m H = 500 GeV 


(b) Flipped 2HDM, m„ = 500 GeV 


Figure 16: Excluded regions of the (cos (J3 - a ), tan/l) parameter space for (a) the Lepton-specific 2HDM signal 
model and (b) the Flipped 2HDM signal model. The grey areas demarcate the phase-space regions where Th/hih > 
0.15, for which the cross-section limits have not been demonstrated to be reliable. 
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Figure 17: Excluded regions of the ( m H , tan/?) parameter space for (a) the Type-I 2HDM signal model and (b) the 
Type-II 2HDM signal model. The grey areas demarcate the phase-space regions where Yh/hih is large ( Yh/ihh > 0.15 
for iiih = 500 GeV, increasing to T H /m H > 0.23 for = 1100 GeV) and the limits have not been demonstrated to 
be reliable. 
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(a) Lepton-specific 2HDM, cos (J3 - a) = -0.2 


(bj Flipped 2HDM, cos (J3 - a) = -0.2 


Figure 18: Excluded regions of the (mn, tan/?) parameter space for (a) the Lepton-specific 2FIDM signal model 
and (b) the Flipped 2HDM signal model. The grey areas demarcate the phase-space regions where Th/hih is large 
(Y H /m H >0.15 for = 500 GeV, increasing to \ ' H lm H > 0.23 for m H = 1100 GeV) and the limits have not been 
demonstrated to be reliable. 
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7 Conclusions 


Two searches for Higgs boson pair production with the ATLAS detector at the LHC using the bbbb final 
state have been presented: one reconstructs Higgs boson candidates from pairs of nearby anti-/:, /Magged 
jets with R = 0.4; the other reconstructs Higgs boson candidates using trimmed anti-/:, jets with R = 1.0 
matched to two /Magged anti -k t track-jets with R = 0.3. Thanks to the high expected h —> bb branching 
ratio and the large background rejection factors offered by the boosted dijet topology, the sensitivity for 
Higgs boson pair production is high, with a mass reach spanning the range between 500 and 2000 GeV. 
There is no evidence for any signal in 19.5 fb _1 of pp collision data with -\/y = 8 TeV. The largest deviation 
from the background-only hypothesis has a global significance of only 0.78 cr. The observed 95% CL 
upper limit on o~(pp —> X —>hh —a bbbb'j is 3.2 (2.3) fb for narrow resonances with a mass of 1.0 (1.5) 
TeV. 

Constraints are placed on several benchmark models. For the bulk RS model with k/M?\ = 1, KK gravitons 
in the mass range 500 < nio KK < 720 GeV are excluded at the 95% CL. For non-resonant signals, using 
Standard Model hh non-resonant production as the benchmark, the observed 95% CL upper limit on 
<r(pp —» hh —> bbbb ) is 202 fb, in good agreement with the expected exclusion. This is to be compared to 
a SM prediction of 3.6 + 0.5 fb. 
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